Convergence Problems of General-Sum Multiagent Reinforcement Learning
Abstract
Stochastic games are a generalization of MDPs to multiple agents and can be used as a framework for investigating multiagent learning. Hu and Wellman (1998) recently proposed a multiagent Q-learning method for general-sum stochastic games. In addition to describing the algorithm, they provide a proof that the method converges to a Nash equilibrium of the game under specified conditions. The convergence depends on a lemma stating that the iteration used by this method is a contraction mapping. Unfortunately, the proof of this lemma is incomplete. In this paper we present a counterexample to the lemma and identify the flaw in its proof. We also introduce strengthened assumptions under which the lemma holds, and examine how this affects the classes of games to which the theoretical result applies.
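The backup at the heart of Hu and Wellman's method replaces the single-agent max with the equilibrium value of the next state's stage game. The sketch below is a simplified illustration for a two-player game that enumerates only pure-strategy Nash equilibria (the actual algorithm permits mixed equilibria); all function names and parameters are illustrative, not from the paper.

```python
import numpy as np

def pure_nash_values(Q1, Q2):
    """Enumerate pure-strategy Nash equilibria of the bimatrix stage
    game (Q1, Q2) and return their payoff pairs (v1, v2)."""
    n, m = Q1.shape
    eqs = []
    for a in range(n):
        for b in range(m):
            # (a, b) is an equilibrium if neither player can gain
            # by deviating unilaterally.
            if (Q1[a, b] >= Q1[:, b].max() and
                    Q2[a, b] >= Q2[a, :].max()):
                eqs.append((Q1[a, b], Q2[a, b]))
    return eqs

def nash_q_update(Q1, Q2, s, s_next, a, b, r1, r2, alpha=0.1, gamma=0.9):
    """One Nash-Q-style step: back up an equilibrium value of the
    next-state stage game instead of a single-agent max."""
    v1, v2 = pure_nash_values(Q1[s_next], Q2[s_next])[0]
    Q1[s, a, b] += alpha * (r1 + gamma * v1 - Q1[s, a, b])
    Q2[s, a, b] += alpha * (r2 + gamma * v2 - Q2[s, a, b])
```

For a prisoner's-dilemma stage game the only pure equilibrium is mutual defection, so that joint value, not the max, is what gets backed up; the contraction property of exactly this kind of operator is what the disputed lemma concerns.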
Related Papers
Friend-or-Foe Q-learning in General-Sum Games
This paper describes an approach to reinforcement learning in multiagent general-sum games in which a learner is told to treat each other agent as either a "friend" or a "foe". This Q-learning-style algorithm provides strong convergence guarantees compared to an existing Nash-equilibrium-based learning rule.
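The two labels correspond to two different backup operators over the joint-action value table. A minimal sketch of the idea, assuming a two-player stage game: the foe case below uses a pure-strategy maximin as a simplification (Littman's foe operator is the mixed-strategy minimax value, computed by a linear program), and the function names are illustrative.

```python
import numpy as np

def friend_value(Q):
    # Friend: assume the other agent helps maximize our payoff,
    # so back up the best joint-action value.
    return Q.max()

def foe_value(Q):
    # Foe: assume the other agent acts to minimize our payoff.
    # Pure-strategy maximin shown here as a simplification; the
    # actual foe operator uses the mixed-strategy minimax value.
    return Q.min(axis=1).max()
```

Either operator can then be dropped into a standard Q-learning target in place of the single-agent max.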
Multiagent reinforcement learning: algorithm converging to Nash equilibrium in general-sum discounted stochastic games
Reinforcement learning turned out to be a technique that allowed robots to ride a bicycle, computers to play backgammon at the level of human world masters, and solved such complicated, high-dimensional tasks as elevator dispatching. Can it come to the rescue in the next generation of challenging problems, like playing football or bidding on virtual markets? Reinforcement learning that provides a way o...
A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities in a network, under the assumption that nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although many algorithms have been developed for community detection, most of them are unsuitable when ...
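Community detection is commonly cast as maximizing Newman's modularity, which rewards partitions with more intra-community edges than a random rewiring would produce. The snippet below is a minimal illustration of that objective only, not the paper's multiagent learning algorithm; the graph and partition are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of an undirected graph, given a
    node -> community assignment."""
    m = len(edges)
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Observed fraction of edges that stay inside a community.
    intra = sum(1 for u, v in edges if community[u] == community[v]) / m
    # Expected intra-community fraction under random rewiring
    # that preserves node degrees.
    tot = defaultdict(int)
    for node, d in deg.items():
        tot[community[node]] += d
    expected = sum((t / (2 * m)) ** 2 for t in tot.values())
    return intra - expected
```

Two triangles joined by a single bridge edge, split into their natural communities, score well above zero, which is the kind of signal a community-detection optimizer climbs.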
State Elimination in Accelerated Multiagent Reinforcement Learning
This paper presents a novel multiagent reinforcement learning algorithm called State Elimination in Accelerated Multiagent Reinforcement Learning (SEA-MRL) that successfully produces faster learning without incorporating into the learning system internal knowledge or human intervention such as reward shaping, transfer learning, parameter tuning, or heuristics. Since the learning spee...
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm
In this paper we adopt general-sum stochastic games as a framework for multiagent reinforcement learning. Our work extends previous work by Littman on zero-sum stochastic games to a broader framework. We design a multiagent Q-learning method under this framework and prove that it converges to a Nash equilibrium under specified conditions. This algorithm is useful for finding the optimal strateg...
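The convergence argument in question rests on the Banach fixed-point theorem: if the backup operator is a contraction, repeated application converges to a unique fixed point. The single-agent Bellman operator demonstrably has this property, and the disputed lemma asserts the same for the multiagent Nash operator. A minimal sketch of the single-agent case, with illustrative names:

```python
import numpy as np

def bellman_backup(Q, P, R, gamma=0.9):
    """Single-agent Bellman operator:
    (TQ)(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * max_a' Q(s', a').
    T is a gamma-contraction in the sup norm, so iterating it
    converges to the unique optimal Q; this is the property the
    disputed lemma claims for the multiagent Nash operator.
    Shapes: Q is (S, A), P is (S, A, S), R is (S, A)."""
    return R + gamma * P @ Q.max(axis=1)
```

Applying the operator to two arbitrary Q-tables shrinks their sup-norm distance by at least the factor gamma, which is exactly the contraction condition the counterexample shows can fail for the Nash-equilibrium backup.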
Published: 2000